Bayesian Hierarchical Modelling for Tailoring Metric Thresholds
Software is highly contextual. While there are cross-cutting `global'
lessons, individual software projects exhibit many `local' properties. This
data heterogeneity makes drawing local conclusions from global data dangerous.
A key research challenge is to construct locally accurate prediction models
that are informed by global characteristics and data volumes. Previous work has
tackled this problem using clustering and transfer learning approaches, which
identify locally similar characteristics. This paper applies a simpler approach
known as Bayesian hierarchical modeling. We show that hierarchical modeling
supports cross-project comparisons, while preserving local context. To
demonstrate the approach, we conduct a conceptual replication of an existing
study on setting software metrics thresholds. Our emerging results show our
hierarchical model reduces model prediction error compared to a global approach
by up to 50%. Comment: Short paper, published at MSR '18: 15th International Conference on
Mining Software Repositories, May 28--29, 2018, Gothenburg, Sweden
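To make the idea of locally accurate estimates informed by global data concrete, here is a minimal Python sketch of partial pooling. It is not the paper's model: it assumes a normal-normal hierarchy with known variances, uses the grand mean of all observations as a stand-in for the global mean, and the names `partial_pool`, `within_var`, and `between_var` are hypothetical.

```python
from statistics import mean


def partial_pool(groups, within_var, between_var):
    """Shrink each project's mean toward the global mean.

    Assumed model (not taken from the paper): normal-normal hierarchy
    with known within-project and between-project variances.
    Pooled mean = w * local_mean + (1 - w) * global_mean,
    where w = between_var / (between_var + within_var / n).
    """
    # Grand mean of every observation, used as the global estimate.
    global_mean = mean(v for values in groups.values() for v in values)
    pooled = {}
    for name, values in groups.items():
        n = len(values)
        # Projects with more data get a weight closer to 1 (less shrinkage).
        w = between_var / (between_var + within_var / n)
        pooled[name] = w * mean(values) + (1 - w) * global_mean
    return pooled


# Hypothetical metric values for two projects.
projects = {"large": [10, 10, 10, 10], "small": [30]}
estimates = partial_pool(projects, within_var=4.0, between_var=4.0)
```

In this sketch the project with a single observation is pulled further toward the global mean than the project with four, which is the local-versus-global trade-off the abstract describes. A full Bayesian treatment would also infer the variances rather than fix them.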
Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC
Markov chain Monte Carlo is a key computational tool in Bayesian statistics,
but it can be challenging to monitor the convergence of an iterative stochastic
algorithm. In this paper we show that the convergence diagnostic $\widehat{R}$
of Gelman and Rubin (1992) has serious flaws. Traditional $\widehat{R}$ will
fail to correctly diagnose convergence failures when the chain has a heavy tail
or when the variance varies across the chains. In this paper we propose an
alternative rank-based diagnostic that fixes these problems. We also introduce
a collection of quantile-based local efficiency measures, along with a
practical approach for computing Monte Carlo error estimates for quantiles. We
suggest that common trace plots should be replaced with rank plots from
multiple chains. Finally, we give recommendations for how these methods should
be used in practice. Comment: Minor revision for improved clarity
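The rank-based diagnostic described in this abstract can be sketched in plain Python. This is an illustrative reading, not the authors' reference implementation: it assumes equal-length chains with at least four draws each, takes the maximum of a rank-normalized split statistic and its folded counterpart, and all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, median, variance


def split_chains(chains):
    """Split every chain into a first and second half."""
    halves = []
    for chain in chains:
        half = len(chain) // 2
        halves.append(chain[:half])
        halves.append(chain[len(chain) - half:])
    return halves


def rank_normalize(chains):
    """Replace draws by normal scores of their pooled ranks (ties averaged)."""
    flat = [x for chain in chains for x in chain]
    total = len(flat)
    order = sorted(range(total), key=lambda i: flat[i])
    ranks = [0.0] * total
    i = 0
    while i < total:
        j = i
        while j + 1 < total and flat[order[j + 1]] == flat[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    scores = [normal.inv_cdf((r - 0.375) / (total + 0.25)) for r in ranks]
    # Restore the original chain layout.
    result, start = [], 0
    for chain in chains:
        result.append(scores[start:start + len(chain)])
        start += len(chain)
    return result


def classic_rhat(chains):
    """Between/within variance ratio in the style of Gelman and Rubin (1992)."""
    n = len(chains[0])
    within = mean(variance(chain) for chain in chains)
    between = n * variance([mean(chain) for chain in chains])
    var_plus = (n - 1) / n * within + between / n
    return sqrt(var_plus / within)


def rank_rhat(chains):
    """Maximum of the rank-normalized split statistic and its folded version."""
    halves = split_chains(chains)
    bulk = classic_rhat(rank_normalize(halves))
    centre = median(x for chain in halves for x in chain)
    folded = [[abs(x - centre) for x in chain] for chain in halves]
    tail = classic_rhat(rank_normalize(folded))
    return max(bulk, tail)
```

Folding the draws around the median is what lets the statistic react to chains that share a location but differ in scale, the second failure case named in the abstract. The sketch does not guard against degenerate input such as constant chains, where the within-chain variance is zero.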
Quantitative comparisons on hand motor functional areas determined by resting state and task BOLD fMRI and anatomical MRI for pre-surgical planning of patients with brain tumors
For pre-surgical planning, we present a quantitative comparison of the location of the hand motor functional area determined by right-hand finger-tapping BOLD fMRI, by resting-state BOLD fMRI (rs-fMRI), and anatomically using high-resolution T1-weighted images. Data were obtained on 10 healthy subjects and 25 patients with left-sided brain tumors. Our results show that there are important differences (i.e., >20 mm) in the locations of the hand motor voxels determined by these three MR imaging methods. Depending on the modality used, this can have a significant effect on the pre-surgical planning of these patients. In 13 of the 25 cases (i.e., 52%), the distances between the task-determined and the rs-fMRI-determined hand areas were more than 20 mm; in 13 of 25 cases (i.e., 52%), the distances between the task-determined and anatomically determined hand areas were more than 20 mm; and in 16 of 25 cases (i.e., 64%), the distances between the rs-fMRI-determined and anatomically determined hand areas were more than 20 mm. In just three cases, the locations determined by all three modalities were within 20 mm of each other. The differences in the location or fingerprint of the hand motor areas, as determined by these three MR methods, result from the different underlying mechanisms of these three modalities and possibly from the effects of tumors on these modalities.
Voices for Two-Generation Success: Seeking Stable Futures
Findings from 10 focus groups with low- and moderate-income mothers and with teenage boys and girls.
Comparing Bayesian Models of Annotation
The analysis of crowdsourced annotations in NLP is concerned with identifying 1) gold standard labels, 2) annotator accuracies and biases, and 3) item difficulties and error patterns. Traditionally, majority voting was used for 1), and coefficients of agreement for 2) and 3). Lately, model-based analyses of corpus annotations have proven better at all three tasks. But there has been relatively little work comparing them on the same datasets. This paper aims to fill this gap by analyzing six models of annotation, covering different approaches to annotator ability, item difficulty, and parameter pooling (tying) across annotators and items. We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators. We conclude with guidelines for model selection, application, and implementation.
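For reference, the traditional majority-voting baseline mentioned in this abstract can be sketched as follows. This is only the baseline, not any of the six models the paper compares. The `(item, annotator, label)` tuple layout, the function names, and the sample data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Pick the most frequent label per item.

    `annotations` is an iterable of (item, annotator, label) tuples.
    Ties go to the label that was seen first for that item.
    """
    votes = defaultdict(Counter)
    for item, _annotator, label in annotations:
        votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


def agreement_with_consensus(annotations, consensus):
    """Fraction of each annotator's labels that match the consensus label."""
    hits, totals = Counter(), Counter()
    for item, annotator, label in annotations:
        totals[annotator] += 1
        if consensus[item] == label:
            hits[annotator] += 1
    return {annotator: hits[annotator] / totals[annotator] for annotator in totals}


# Hypothetical crowdsourced labels for two items from three annotators.
data = [
    ("i1", "a", "pos"), ("i1", "b", "pos"), ("i1", "c", "neg"),
    ("i2", "a", "neg"), ("i2", "b", "neg"), ("i2", "c", "neg"),
]
consensus = majority_vote(data)
agreement = agreement_with_consensus(data, consensus)
```

Majority voting weights every annotator equally, so random (spammy) annotators influence the consensus as much as careful ones. Model-based approaches address this by estimating annotator ability.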
A Portable, Server-Side Dialog Framework for VoiceXML
We describe a spoken dialog application framework that combines the power and flexibility of server-side Java Servlets and Java Server Pages (JSPs) with the deployment portability, reliability, and scalability of standard web (HTTP) servers and VoiceXML clients. Applications are developed by extending a framework of Java classes in order to define dialogs through lower-level actions such as speech recognition, audio prompting, speech synthesis, and backend data access. The framework delegates session data management to servlets, embedding frame-based representations for the application's global and session data. Dialog flow is controlled through general constructions such as loops, conditionals, and scoped sub-dialogs, along with scoped command, error, and exception handling. Prompting and grammars are configured through simple JSP templates that generate the VoiceXML instructions for the server to return to the client. The framework is designed to be extensible, as demonstrated by the implementation of customizable backup and repeat commands integrated with session data, command handling, and grammar scoping.